Open Data Platform for Knowledge Access in Plant Health Domain: VESPA Mining

Authors

  • Nicolas Turenne
  • Mathieu Andro
  • Roselyne Corbière
  • Tien T. Phan
Abstract

Important data are locked in ancient literature. It would be uneconomical to produce these data again today, or to extract them without the help of text mining technologies. Vespa is a text mining project that aims to extract data on pest and crop interactions, to model and predict attacks on crops, and to reduce the use of pesticides. A few earlier efforts have proposed systems for accessing agricultural information. These include systems with a sociosemantic approach (Turenne & Barbier, 2004) and systems with a thesaurus-oriented approach (Milne et al, 2006; Bartol, 2009; Vakkari, 2010; Laporte et al, 2012). Unlike the Vespa Mining platform, however, these systems are not driven by concrete usages. Another original aspect of our work is that document parsing depends on the document architecture (sections: chapters, subchapters...). Much work has been done on retrieving named entities such as genes, persons, and organizations from texts with good efficiency (Riloff, 1996; Roth & Yih, 2002; Carpenter, 2007; Surdeanu et al, 2011; Krieger et al, 2014). Some state-of-the-art tools are even available (http_Lingpipe, 2015; http_SNER, 2015), but in these approaches entities are searched for within a sentence, and relations are often searched for within the same sentence. Our approach is different: we search for relations within the same section. Our approach (Turenne & Phan, 2015) relies on dictionary matching and on user-defined rules, written with a tool called Unitex (http_Unitex, 2015), to detect entities and document design, together with cooccurrence analysis to detect relations. High-efficiency relation extraction reduces the time spent browsing documents to identify relevant information.
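The idea of section-level relation detection can be illustrated with a minimal sketch. It is not the Vespa Mining implementation: plain regular expressions stand in for Unitex rules, the two small dictionaries and the sample sections are hypothetical, and a relation is simply assumed whenever a pest term and a crop term occur in the same section.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy dictionaries; the platform's real lexicons are not given here.
PESTS = {"aphid", "late blight"}
CROPS = {"potato", "wheat"}


def find_terms(text, dictionary):
    """Return the dictionary terms found in text as whole words (case-insensitive)."""
    found = set()
    for term in dictionary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term)
    return found


def section_cooccurrences(sections):
    """Count (pest, crop) pairs whose terms appear together in the same section."""
    counts = Counter()
    for section in sections:
        pests = find_terms(section, PESTS)
        crops = find_terms(section, CROPS)
        # Every pest/crop combination inside one section counts as one cooccurrence.
        counts.update(product(sorted(pests), sorted(crops)))
    return counts


# Hypothetical sections: the pest and the crop sit in different sentences
# of the first section, so a sentence-level search would miss the pair.
sections = [
    "Late blight was reported in the region. Losses on potato were severe.",
    "Wheat yields were normal this year.",
]
pairs = section_cooccurrences(sections)
```

Here `pairs` records the pair `("late blight", "potato")` once, because both terms fall in the first section even though they are in different sentences. The second section mentions a crop but no pest, so it contributes nothing.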

Journal:
  • CoRR

Volume abs/1504.06077

Publication date 2015